A Machine Learning Approach to Workflow Prioritization
Evaluating the Performance of Extreme Gradient Boosting
S&P Global
Background
Our team was tasked with creating a model that could predict whether a given process would result in a database update.
In the sample dataset, approximately 50% of the processes did not produce an update. These can be appropriately classified as unproductive work.
We found that by combining a multitude of features with some of the previous prioritization metrics, we could predict almost perfectly whether these processes would result in an update, allowing our client to substantially reduce wasted man-hours in the future.
Diagram Notes
Values truncated to 1 decimal to aid performance.
Available features limited to non-proprietary information.
Accordingly, Features A through C are placeholders for proprietary information.
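The truncation described in these notes, and the link counts the diagram is drawn from, can be sketched roughly as follows. This is a minimal illustration using only the Python standard library rather than Pandas. The record shape, the column names (result, score, priority) and the helpers truncate and build_links are hypothetical and are not taken from the original implementation.

```python
import math
from collections import Counter


def truncate(value, decimals=1):
    """Truncate (not round) a number to the given number of decimals."""
    factor = 10 ** decimals
    return math.trunc(value * factor) / factor


def build_links(rows, axes):
    """Count how many rows pass between each pair of adjacent axes.

    rows: list of dicts, one per process (hypothetical record shape).
    axes: ordered list of column names; the order defines the links.
    """
    links = Counter()
    for row in rows:
        # Truncating first collapses near-identical values into one node,
        # which keeps the number of distinct links small.
        values = [truncate(row[axis]) for axis in axes]
        for i in range(len(axes) - 1):
            source = (axes[i], values[i])
            target = (axes[i + 1], values[i + 1])
            links[(source, target)] += 1
    return links
```

Calling build_links(rows, ["result", "score", "priority"]) returns a Counter keyed by (source node, target node) pairs. Changing the order of the axes list changes which links are produced.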
Legend
Results (1 = Update; 0 = No Update)
Probability Score (Proposed Metric)
Process Priority (Existing Metric)
Feature A (Dummy)
Feature B (Dummy)
Feature C (Dummy)
Extensions
Links could be created dynamically by preprocessing in d3 rather than in Pandas; this would also make it easier to expand the feature set.
If dynamically calculated, link ordering could be specified from a dropdown menu, allowing better data exploration.
In production, a live connection to a SQL database could also make it easier and faster to explore and understand the data within S&P Global.
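One way these extensions might fit together is to let the database aggregate the links for whichever pair of axes the user selects. The sketch below is only an assumption about how that could look. It uses Python's built-in sqlite3 module as a stand-in for the real database, and the table name processes, the column names and the fetch_links helper are all hypothetical.

```python
import sqlite3

# Whitelist of selectable axis columns (placeholder names, not the real schema).
ALLOWED_AXES = {"result", "score", "priority"}


def fetch_links(conn, source, target):
    """Return (source value, target value, count) rows for one pair of axes."""
    if source not in ALLOWED_AXES or target not in ALLOWED_AXES:
        raise ValueError("unknown axis column")
    # Column names cannot be bound as query parameters, so they are checked
    # against the whitelist above before being interpolated.
    # CAST(x * 10 AS INTEGER) / 10.0 truncates each value to one decimal.
    query = (
        f"SELECT CAST({source} * 10 AS INTEGER) / 10.0 AS src, "
        f"CAST({target} * 10 AS INTEGER) / 10.0 AS dst, "
        "COUNT(*) AS n "
        "FROM processes GROUP BY src, dst ORDER BY src, dst"
    )
    return conn.execute(query).fetchall()
```

A dropdown selection would then map to the source and target arguments, so each reordering triggers a fresh aggregate query instead of a precomputed file.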